A Maximum Entropy Approach to Semi-supervised Learning
Abstract
Various supervised inference methods can be analyzed as convex duals of a generalized maximum entropy framework, in which the goal is to find the distribution with maximum entropy subject to moment-matching constraints on the data. We extend this framework to semi-supervised learning using two approaches: (1) incorporating unlabeled data into the data constraints and (2) imposing similarity constraints based on the geometry of the data. The proposed approach leads to a family of discriminative semi-supervised algorithms that are convex, scalable, inherently multiclass, easy to implement, and naturally kernelizable. Experimental evaluation of special cases shows that the methodology is competitive.
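The duality mentioned in the abstract can be illustrated on the purely supervised case. The following is a minimal sketch, not the paper's semi-supervised algorithm: it assumes the well-known special case in which the dual of conditional maximum entropy with moment-matching constraints is multinomial logistic regression. The function names (`fit_maxent`, `moment_gap`), the synthetic data, and the optional L2 penalty `eps` (standing in for relaxed constraints) are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with the usual max-shift for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_maxent(X, y, n_classes, eps=0.0, lr=0.5, n_iter=2000):
    # Gradient descent on the dual objective: average negative
    # log-likelihood plus an optional L2 penalty of strength eps
    # (assumed here as a stand-in for relaxed moment constraints).
    n, d = X.shape
    Y = np.eye(n_classes)[y]          # one-hot labels
    W = np.zeros((d, n_classes))
    for _ in range(n_iter):
        P = softmax(X @ W)
        # Gradient = model feature moments - empirical feature moments.
        grad = X.T @ (P - Y) / n + eps * W
        W -= lr * grad
    return W

def moment_gap(X, y, W, n_classes):
    # Largest absolute difference between empirical and model moments.
    Y = np.eye(n_classes)[y]
    P = softmax(X @ W)
    return np.abs(X.T @ (Y - P) / len(X)).max()

# Synthetic data (hypothetical): noisy labels drawn from a softmax model.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_W = rng.normal(size=(3, 3))
y = np.array([rng.choice(3, p=p) for p in softmax(X @ true_W)])

W = fit_maxent(X, y, n_classes=3)
gap = moment_gap(X, y, W, n_classes=3)
print("max moment mismatch:", gap)
```

With `eps=0`, the gradient of the dual vanishes exactly when the model's expected features equal the empirical ones, so a small `gap` indicates that the primal moment-matching constraints hold. Setting `eps > 0` leaves a residual mismatch proportional to the weights, which is one way to read the relaxed constraints of generalized MaxEnt.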
Similar resources
Semi-Supervised Learning via Generalized Maximum Entropy
Various supervised inference methods can be analyzed as convex duals of the generalized maximum entropy (MaxEnt) framework. Generalized MaxEnt aims to find a distribution that maximizes an entropy function while respecting prior information represented as potential functions in miscellaneous forms of constraints and/or penalties. We extend this framework to semi-supervised learning by incorpora...
Graph Based Semi-Supervised Approach For Information Extraction
Classification techniques deploy supervised labeled instances to train classifiers for various classification problems. However, labeled instances are limited, expensive, and time-consuming to obtain, due to the need for experienced human annotators. Meanwhile, large amounts of unlabeled data are usually easy to obtain. Semi-supervised learning addresses the problem of utilizing unlabeled data along...
Semi-supervised learning for text classification using feature affinity regularization
Most conventional semi-supervised learning methods attempt to directly include unlabeled data into training objectives. This paper presents an alternative approach that learns feature affinity information from unlabeled data, which is incorporated into the training objective as regularization of a maximum entropy model. The regularization favors models for which correlated features have similar...
A Rate Distortion Approach for Semi-Supervised Conditional Random Fields
We propose a novel information theoretic approach for semi-supervised learning of conditional random fields that defines a training objective to combine the conditional likelihood on labeled data and the mutual information on unlabeled data. In contrast to previous minimum conditional entropy semi-supervised discriminative learning methods, our approach is grounded on a more solid foundation, t...
Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk
This study explores a semi-supervised classification approach using random forest as a base classifier to classify the risk of low-back disorders (LBDs) associated with industrial jobs. The semi-supervised classification approach uses unlabeled data together with a small number of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...
Publication date: 2010